Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory

ABSTRACT

A user dictionary, which is formed by storing pronunciations and notations of target recognition words designated by the user in correspondence with each other, input speech recognition data, and dictionary management data used to determine the recognition field of a recognition dictionary used in recognition of the speech recognition data are sent to a server via a communication module. In the server, a dictionary management unit looks up an identifier table to determine a recognition dictionary corresponding to the dictionary management information received from a client from a plurality of kinds of recognition dictionaries. A speech recognition module recognizes the speech recognition data using at least the determined recognition dictionary. The recognition result is sent to the client via a communication module.

FIELD OF THE INVENTION

[0001] The present invention relates to a client-server speech recognition system for recognizing speech input at a client by a server, a speech recognition server, a speech recognition client, their control method, and a computer readable memory.

BACKGROUND OF THE INVENTION

[0002] In recent years, speech has come into use as an input interface in addition to a keyboard, mouse, and the like.

[0003] However, as the number of recognition words which are to undergo speech recognition becomes larger, the recognition rate of speech recognition lowers and a longer processing time is required. For this reason, in an actual method, a plurality of recognition dictionaries or lexicons that register recognition words (e.g., pronunciations and notations) which are to undergo speech recognition are prepared, and are selectively used (a plurality of recognition dictionaries may be used at the same time).

[0004] Also, unregistered words cannot be recognized. As one method for solving this problem, a user dictionary or lexicon (prepared by the user to register recognition words which are to undergo speech recognition) may be used.

[0005] On the other hand, a client-server speech recognition system has been studied to implement speech recognition on a terminal with insufficient resources.

[0006] These three techniques are known to those who are skilled in the art, but a system that combines these three techniques has not been realized yet.

SUMMARY OF THE INVENTION

[0007] The present invention has been made to solve the above problems, and has as its object to provide a speech recognition system which uses a user dictionary in response to a user's request in a client-server speech recognition system so as to improve speech input efficiency and to reduce the processing load on the entire system, a speech recognition server, a speech recognition client, their control method, and a computer readable memory.

[0008] According to the present invention, the foregoing object is attained by providing a client-server speech recognition system for recognizing speech input at a client by a server,

[0009] the client comprising:

[0010] speech input means for inputting speech;

[0011] user dictionary holding means for holding a user dictionary formed by registering target recognition words designated by a user; and

[0012] transmission means for transmitting speech data input by said speech input means, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server, and

[0013] the server comprising:

[0014] recognition dictionary holding means for holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields;

[0015] determination means for determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and

[0016] recognition means for recognizing the speech data using at least the recognition dictionary determined by said determination means.

[0017] Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] FIG. 1 is a block diagram showing the hardware arrangement of a speech recognition system of the first embodiment;

[0019] FIG. 2 is a block diagram showing the functional arrangement of the speech recognition system of the first embodiment;

[0020] FIG. 3 shows the configuration of a user dictionary of the first embodiment;

[0021] FIG. 4 shows a speech input window of the first embodiment;

[0022] FIG. 5 shows an identifier table of the first embodiment;

[0023] FIG. 6 is a flow chart showing the process executed by the speech recognition system of the first embodiment;

[0024] FIG. 7 shows the configuration of a user dictionary appended with input form identifiers according to the third embodiment; and

[0025] FIG. 8 shows the configuration of a user dictionary appended with recognition dictionary identifiers according to the third embodiment.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0026] Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0027] [First Embodiment]

[0028] FIG. 1 shows the hardware arrangement of a speech recognition system of the first embodiment.

[0029] A CPU 101 systematically controls an entire client 100. The CPU 101 loads programs stored in a ROM 102 onto a RAM 103, and executes various processes on the basis of the loaded programs. The ROM 102 stores various programs of processes to be executed by the CPU 101. The RAM 103 provides a storage area required to execute various programs stored in the ROM 102.

[0030] A secondary storage device 104 stores an OS and various programs. When the client 100 is implemented using not a general-purpose apparatus such as a personal computer or the like but a dedicated apparatus, the ROM 102 may store the OS and various programs. By loading the stored programs onto the RAM 103, the CPU 101 can execute processes. As the secondary storage device 104, a hard disk device, floppy disk drive, CD-ROM, or the like may be used. That is, storage media are not particularly limited.

[0031] A network I/F (interface) 105 is connected to a network I/F 205 of a server 200.

[0032] An input device 106 comprises a mouse, keyboard, microphone, and the like to allow input of various instructions to processes to be executed by the CPU 101; a plurality of these devices can be connected and used simultaneously. An output device 107 comprises a display (CRT, LCD, or the like), and displays information input by the input device 106, and display windows which are controlled by various processes executed by the CPU 101. A bus 108 interconnects various building components of the client 100.

[0033] A CPU 201 systematically controls the entire server 200. The CPU 201 loads programs stored in a ROM 202 onto a RAM 203, and executes various processes on the basis of the loaded programs. The ROM 202 stores various programs of processes to be executed by the CPU 201. The RAM 203 provides a storage area required to execute various programs stored in the ROM 202.

[0034] A secondary storage device 204 stores an OS and various programs. When the server 200 is implemented using not a general-purpose apparatus such as a personal computer or the like but a dedicated apparatus, the ROM 202 may store the OS and various programs. By loading the stored programs onto the RAM 203, the CPU 201 can execute processes. As the secondary storage device 204, a hard disk device, floppy disk drive, CD-ROM, or the like may be used. That is, storage media are not particularly limited.

[0035] The network I/F 205 is connected to the network I/F 105 of the client 100. A bus 206 interconnects various building components of the server 200.

[0036] The functional arrangement of the speech recognition system of the first embodiment will be described below using FIG. 2.

[0037] FIG. 2 is a block diagram showing the functional arrangement of the speech recognition system of the first embodiment.

[0038] In the client 100, a speech input module 121 inputs speech uttered by the user via a microphone (input device 106), and A/D-converts input speech data (speech recognition data) which is to undergo speech recognition. A communication module 122 sends a user dictionary 124 a, speech recognition data 124 b, dictionary management information 124 c, and the like to the server 200. Also, the communication module 122 receives a speech recognition result of the sent speech recognition data 124 b and the like from the server 200.

[0039] A display module 123 displays the speech recognition result received from the server 200 while storing it in, e.g., an input form which is displayed on the output device 107 by the process executed by the speech recognition system of this embodiment.

[0040] In the server 200, a communication module 221 receives the user dictionary 124 a, speech recognition data 124 b, dictionary management information 124 c, and the like from the client 100. Also, the communication module 221 sends the speech recognition result of the speech recognition data 124 b and the like to the client 100.

[0041] A dictionary management module 223 switches and selects a plurality of kinds of recognition dictionaries 225 (recognition dictionary 1 to recognition dictionary N, N: a positive integer) prepared for respective recognition fields (e.g., for names, addresses, alphanumeric symbols, and the like), and the user dictionary 124 a received from the client 100 (a plurality of kinds of dictionaries may be used simultaneously).

[0042] Note that the plurality of kinds of recognition dictionaries 225 are prepared for each dictionary management information 124 c (input form identifier; to be described later) sent from the client 100. Each recognition dictionary 225 is appended with a recognition dictionary identifier indicating the recognition field of that recognition dictionary. The dictionary management module 223 manages an identifier table 223 a that stores the recognition dictionary identifiers and input form identifiers in correspondence with each other, as shown in FIG. 5.

[0043] A speech recognition module 224 executes speech recognition using the recognition dictionary or dictionaries 225 and user dictionary 124 a designated for speech recognition by the dictionary management module 223 on the basis of the speech recognition data 124 b and dictionary management information 124 c received from the client 100.

[0044] Note that the user dictionary 124 a is prepared by the user to register recognition words which are to undergo speech recognition, and stores pronunciations and notations of words to be recognized in correspondence with each other, as shown in, e.g., FIG. 3.
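
As a minimal sketch of such a user dictionary, the following Python fragment holds pronunciation/notation pairs and serializes them for transmission to the server. The class name `UserWord`, the two helper functions, the JSON wire format, and the sample entries are all assumptions made for illustration; the description does not specify a data layout or transfer encoding.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class UserWord:
    """One user dictionary entry (hypothetical layout)."""
    pronunciation: str  # how the word is spoken
    notation: str       # how the word is written in the recognition result


def encode_user_dictionary(words):
    """Serialize the user dictionary to bytes (JSON is an assumed format)."""
    return json.dumps([asdict(w) for w in words]).encode("utf-8")


def decode_user_dictionary(payload):
    """Rebuild the user dictionary on the receiving side."""
    return [UserWord(**item) for item in json.loads(payload.decode("utf-8"))]


# Illustrative entries only; these words do not come from FIG. 3.
user_dictionary = [
    UserWord("ay see em ee", "ACME"),
    UserWord("jay doh", "J. Doe"),
]
payload = encode_user_dictionary(user_dictionary)
restored = decode_user_dictionary(payload)
```

Keeping each entry as an immutable pair makes the round trip easy to check: the decoded list compares equal to the original one.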

[0045] The speech recognition data 124 b may be either speech data A/D-converted by the speech input module 121 or data obtained by encoding that speech data.

[0046] The dictionary management information 124 c indicates an input object and the like. For example, the dictionary management information 124 c is an identifier (input form identifier) indicating the type of input form when the server 200 recognizes input speech and inputs text data corresponding to that speech recognition result to each input form, which defines a speech input window displayed by the speech recognition system of the first embodiment, as shown in FIG. 4. The client 100 sends this input form identifier to the server 200 as the dictionary management information 124 c. In the server 200, the dictionary management module 223 looks up the identifier table 223 a to acquire a recognition dictionary identifier corresponding to the received input form identifier, and determines a recognition dictionary 225 to be used in speech recognition.
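
The lookup performed by the dictionary management module 223 can be sketched as two table lookups: input form identifier to recognition dictionary identifier(s), then identifier to dictionary. The identifier strings, dictionary contents, and the function name `determine_recognition_dictionaries` below are hypothetical; the actual contents of the identifier table of FIG. 5 are not reproduced here.

```python
# Hypothetical identifier table: input form identifier -> recognition
# dictionary identifiers. A form may map to more than one dictionary.
IDENTIFIER_TABLE = {
    "name_form": ["names"],
    "address_form": ["addresses"],
    "phone_form": ["alphanumeric"],
}

# Hypothetical recognition dictionaries, keyed by recognition dictionary
# identifier. Each maps a pronunciation to a notation.
RECOGNITION_DICTIONARIES = {
    "names": {"suzuki": "Suzuki"},
    "addresses": {"tokyo": "Tokyo"},
    "alphanumeric": {"zero": "0", "one": "1"},
}


def determine_recognition_dictionaries(input_form_id):
    """Return the dictionaries selected for the received input form identifier.

    An unknown identifier selects no dictionary; how the server should react
    in that case is not stated in the description.
    """
    dictionary_ids = IDENTIFIER_TABLE.get(input_form_id, [])
    return {d: RECOGNITION_DICTIONARIES[d] for d in dictionary_ids}


selected = determine_recognition_dictionaries("address_form")
```

Because the table is keyed by input form identifier, the client only needs to send that identifier; the mapping to recognition fields stays on the server.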

[0047] The process executed by the speech recognition system of the first embodiment will be explained below using FIG. 6.

[0048] FIG. 6 is a flow chart showing the process executed by the speech recognition system of the first embodiment.

[0049] In step S101, the client 100 sends the user dictionary 124 a to the server 200.

[0050] In step S201, the server 200 receives the user dictionary 124 a from the client 100.

[0051] In step S102, when speech is input to an input form as a target speech input, the client 100 sends the input form identifier of that input form to the server 200 as the dictionary management information 124 c.

[0052] In step S202, the server 200 receives the input form identifier from the client 100 as the dictionary management information 124 c.

[0053] In step S203, the server 200 looks up the identifier table 223 a using the dictionary management information 124 c to acquire a recognition dictionary identifier corresponding to the received input form identifier, and determines a recognition dictionary 225 to be used in speech recognition.

[0054] In step S103, the client 100 sends speech recognition data 124 b, which is speech-input as text data to be input to each input form, to the server 200.

[0055] In step S204, the server 200 receives the speech recognition data corresponding to each input form from the client 100.

[0056] In step S205, the server 200 executes speech recognition of the speech recognition data 124 b in the speech recognition module 224 using the recognition dictionary 225 and user dictionary 124 a designated for speech recognition by the dictionary management module 223.

[0057] In the first embodiment, all recognition words contained in the user dictionary 124 a sent from the client 100 to the server 200 are used in speech recognition by the speech recognition module 224.

[0058] In step S206, the server 200 sends the speech recognition result obtained by the speech recognition module 224 to the client 100.

[0059] In step S104, the client 100 receives the speech recognition result corresponding to each input form from the server 200, and stores text data corresponding to the speech recognition result in the corresponding input form.

[0060] The client 100 checks in step S105 if the processing is to be ended. If the processing is not to be ended (NO in step S105), the flow returns to step S102 to repeat the processing. On the other hand, if the processing is to be ended (YES in step S105), the client 100 informs the server 200 of the end of the processing, and ends the processing.

[0061] It is checked in step S207 if a processing end instruction from the client 100 is detected. If no processing end instruction is detected (NO in step S207), the flow returns to step S202 to repeat the above processes. On the other hand, if the processing end instruction is detected (YES in step S207), the processing ends.
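
The exchange of steps S101 to S105 and S201 to S207 can be sketched in-process as follows. This is only an illustration under strong assumptions: the `Server` class, `run_client`, and all identifiers are hypothetical, network transfer is replaced by direct method calls, and recognition is replaced by a stand-in that treats the "speech data" as a pronunciation string and looks it up; a real speech recognition module 224 would decode audio.

```python
class Server:
    """In-process stand-in for the server 200 (hypothetical API)."""

    def __init__(self, identifier_table, recognition_dictionaries):
        self.identifier_table = identifier_table
        self.recognition_dictionaries = recognition_dictionaries
        self.user_dictionary = {}
        self.active = []

    def receive_user_dictionary(self, user_dictionary):
        # S201: keep the user dictionary sent by the client.
        self.user_dictionary = dict(user_dictionary)

    def receive_form_identifier(self, input_form_id):
        # S202, S203: look up the identifier table and select dictionaries.
        ids = self.identifier_table.get(input_form_id, [])
        self.active = [self.recognition_dictionaries[i] for i in ids]

    def recognize(self, speech_data):
        # S204 to S206: stand-in recognition using the selected recognition
        # dictionaries together with the whole user dictionary.
        for dictionary in self.active + [self.user_dictionary]:
            if speech_data in dictionary:
                return dictionary[speech_data]
        return None


def run_client(server, user_dictionary, utterances):
    """Client side: S101 once, then S102 to S104 per input form."""
    server.receive_user_dictionary(user_dictionary)      # S101
    forms = {}
    for input_form_id, speech_data in utterances:        # loop ends at S105
        server.receive_form_identifier(input_form_id)    # S102
        forms[input_form_id] = server.recognize(speech_data)  # S103, S104
    return forms


table = {"name_form": ["names"]}
dictionaries = {"names": {"suzuki": "Suzuki"}}
filled = run_client(
    Server(table, dictionaries),
    {"acme": "ACME"},
    [("name_form", "suzuki"), ("company_form", "acme")],
)
```

In this sketch the second utterance is resolved only through the user dictionary, which mirrors the point of sending that dictionary to the server before any speech is input.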

[0062] In the above processing, when speech is input to an input form as a target speech input, the dictionary management information 124 c corresponding to that input form is sent from the client 100 to the server 200. Alternatively, the dictionary management information 124 c may be sent when the input form as a target speech input is focused by an instruction from the input device 106 (the input form as a target speech input is determined).

[0063] In the server 200, speech recognition is made after all speech recognition data 124 b are received. Alternatively, every time speech is input as text data to a given input form, that portion of the speech recognition data 124 b may be sent to the server 200 frame by frame (for example, one frame is 10 msec of speech data), and speech recognition may be made in real time.
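
The frame-by-frame alternative can be sketched as a generator that cuts a PCM buffer into 10 msec frames. The 16 kHz sampling rate and 16-bit samples are assumptions chosen for the example (the description fixes only the frame duration), and `split_into_frames` is a hypothetical helper name.

```python
def split_into_frames(pcm, sample_rate_hz=16000, bytes_per_sample=2, frame_ms=10):
    """Yield successive frames of `frame_ms` milliseconds from raw PCM bytes.

    The last frame may be shorter than the others if the buffer length is
    not a multiple of the frame size.
    """
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    frame_bytes = samples_per_frame * bytes_per_sample
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# 1000 bytes of silence stands in for A/D-converted speech.
frames = list(split_into_frames(bytes(1000)))
```

With the assumed parameters one frame holds 160 samples, or 320 bytes, so the 1000-byte buffer yields three full frames and one 40-byte remainder. Each frame could be handed to the communication module as soon as it is available instead of waiting for the whole utterance.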

[0064] As described above, according to the first embodiment, in the client-server speech recognition system, since the server 200 executes speech recognition of speech recognition data 124 b using both an appropriate recognition dictionary 225 and the user dictionary 124 a, the speech recognition precision in the server 200 can be improved while reducing the processing load and use of storage resources associated with speech recognition in the client 100.

[0065] [Second Embodiment]

[0066] In the first embodiment, the user dictionary 124 a need not be used if no recognition words to be stored in it have been generated. The server 200 may therefore use the recognition words in the user dictionary 124 a in recognition only when a request to use the user dictionary 124 a is received from the client 100.

[0067] In this case, a flag indicating if the user dictionary 124 a is used is added to the dictionary management information 124 c, thus informing the server 200 of the presence/absence of use of the user dictionary 124 a.
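
A minimal sketch of this flag, assuming the dictionary management information is carried as a small record: the class `DictionaryManagementInfo`, its field names, and the helper `dictionaries_for_recognition` are hypothetical and only illustrate how the server could honor the flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryManagementInfo:
    """Hypothetical layout of the dictionary management information."""
    input_form_id: str
    use_user_dictionary: bool = False  # flag of the second embodiment


def dictionaries_for_recognition(info, selected_dictionaries, user_dictionary):
    """Add the user dictionary to the selected ones only when requested."""
    dictionaries = list(selected_dictionaries)
    if info.use_user_dictionary:
        dictionaries.append(user_dictionary)
    return dictionaries


names = {"suzuki": "Suzuki"}
user_dictionary = {"acme": "ACME"}

without_user = dictionaries_for_recognition(
    DictionaryManagementInfo("name_form"), [names], user_dictionary)
with_user = dictionaries_for_recognition(
    DictionaryManagementInfo("name_form", use_user_dictionary=True),
    [names], user_dictionary)
```

Defaulting the flag to false means a client that never registers user words sends nothing extra and the server searches only the recognition dictionaries.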

[0068] [Third Embodiment]

[0069] Since some target recognition words in the user dictionary 124 a are not used depending on an input object, situation, and the like, only specific recognition words in the user dictionary 124 a may be used in recognition depending on the input object and situation.

[0070] In such a case, when the user dictionary is managed by designating input form identifiers for respective recognition words, as shown in FIG. 7, only recognition words having an input form identifier of the input form used in speech input can be used in recognition. Alternatively, a plurality of input form identifiers may be designated for a given recognition word. In addition, the user dictionary may be managed by designating recognition dictionary identifiers in place of input form identifiers, as shown in FIG. 8.
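
The per-word filtering can be sketched as follows, assuming each user dictionary entry carries a set of input form identifiers. The class `TaggedUserWord`, the function `words_for_form`, and the sample entries are hypothetical; the same shape would apply if recognition dictionary identifiers were stored instead, as in FIG. 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedUserWord:
    """User dictionary entry appended with input form identifiers (assumed layout)."""
    pronunciation: str
    notation: str
    input_form_ids: frozenset  # one or more input form identifiers


def words_for_form(user_dictionary, input_form_id):
    """Keep only the words designated for the input form used in speech input."""
    return [w for w in user_dictionary if input_form_id in w.input_form_ids]


# Illustrative entries only; these do not come from FIG. 7.
user_dictionary = [
    TaggedUserWord("acme", "ACME", frozenset({"company_form"})),
    TaggedUserWord("jay doh", "J. Doe", frozenset({"name_form", "contact_form"})),
]
name_words = words_for_form(user_dictionary, "name_form")
```

Filtering before recognition keeps the number of active recognition words small, which is the stated motivation for selecting dictionaries in the first place.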

[0071] [Fourth Embodiment]

[0072] By combining the second and third embodiments, the efficiency of the speech recognition process of the speech recognition module 224 can be further improved.

[0073] [Fifth Embodiment]

[0074] Most of the processes of the apparatus of the present invention can be implemented by programs. As described above, since the apparatus can use a general-purpose apparatus such as a personal computer, the present invention is also achieved by supplying a storage medium, which records a program code of a software program that can implement the functions of the above-mentioned embodiments to a system or apparatus, and reading out and executing the program code stored in the storage medium by a computer of the system or apparatus. In this case, the program code itself read out from the storage medium implements the functions of the above-mentioned embodiments, and the storage medium which stores the program code constitutes the present invention. As the storage medium for supplying the program code, for example, a floppy disk, hard disk, optical disk, magneto-optical disk, CD-ROM, magnetic tape, nonvolatile memory card, ROM, and the like may be used.

[0075] The present invention can also be achieved by supplying the storage medium that records the program code to a computer, and executing some or all of actual processes executed by an OS running on the computer. Furthermore, the functions of the above-mentioned embodiments may be implemented by some or all of actual processing operations executed by a CPU or the like arranged in a function extension board or a function extension unit, which is inserted in or connected to the computer, after the program code read out from the storage medium is written in a memory of the extension board or unit. When the present invention is applied to the storage medium, that storage medium stores a program code corresponding to the flow chart shown in FIG. 6.

[0076] As many apparently widely different embodiments of the present invention can be made without departing from the spirit and scope thereof, it is to be understood that the invention is not limited to the specific embodiments thereof except as defined in the appended claims.

What is claimed is:
 1. A client-server speech recognition system for recognizing speech input at a client by a server, the client comprising: speech input means for inputting speech; user dictionary holding means for holding a user dictionary formed by registering target recognition words designated by a user; and transmission means for transmitting speech data input by said speech input means, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server, and the server comprising: recognition dictionary holding means for holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields; determination means for determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and recognition means for recognizing the speech data using at least the recognition dictionary determined by said determination means.
 2. The system according to claim 1, wherein said recognition means recognizes the speech data using the recognition dictionary determined by said determination means, and the user dictionary received from the client.
 3. The system according to claim 1, wherein said speech input means comprises display means for displaying an input form as a target speech input, and the dictionary management information is an input form identifier that indicates a type of input form.
 4. The system according to claim 1, wherein the dictionary management information contains information indicating if the user dictionary is used in recognition of the speech data.
 5. The system according to claim 1, wherein the user dictionary is formed by storing pronunciations and notations of the target recognition words in correspondence with each other.
 6. The system according to claim 3, wherein the user dictionary is formed by also storing at least one input form identifier and the target recognition words in correspondence with each other.
 7. The system according to claim 1, wherein the user dictionary is formed by also storing at least one of recognition dictionary identifiers indicating recognition fields of the plurality of kinds of recognition dictionaries, and the target recognition words.
 8. The system according to claim 1, wherein the speech data is data obtained by encoding that speech data.
 9. A method of controlling a client-server speech recognition system for recognizing speech input at a client by a server, comprising: a speech input step of inputting speech; a user dictionary holding step of holding, in the client, a user dictionary formed by registering target recognition words designated by a user; and a transmission step of transmitting speech data input in the speech input step, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server; a recognition dictionary holding step of holding, in the server, a plurality of kinds of recognition dictionaries prepared for respective recognition fields; a determination step of determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and a recognition step of recognizing the speech data using at least the recognition dictionary determined in the determination step.
 10. The method according to claim 9, wherein the recognition step includes a step of recognizing the speech data using the recognition dictionary determined in the determination step, and the user dictionary received from the client.
 11. The method according to claim 9, wherein the speech input step comprises a display step of displaying an input form as a target speech input, and the dictionary management information is an input form identifier that indicates a type of input form.
 12. The method according to claim 9, wherein the dictionary management information contains information indicating if the user dictionary is used in recognition of the speech data.
 13. The method according to claim 9, wherein the user dictionary is formed by storing pronunciations and notations of the target recognition words in correspondence with each other.
 14. The method according to claim 11, wherein the user dictionary is formed by also storing at least one input form identifier and the target recognition words in correspondence with each other.
 15. The method according to claim 9, wherein the user dictionary is formed by also storing at least one of recognition dictionary identifiers indicating recognition fields of the plurality of kinds of recognition dictionaries, and the target recognition words.
 16. The method according to claim 9, wherein the speech data is data obtained by encoding that speech data.
 17. A computer readable memory that stores a program code of control of a client-server speech recognition system for recognizing speech input at a client by a server, comprising: a program code of a speech input step of inputting speech; a program code of a user dictionary holding step of holding, in the client, a user dictionary formed by registering target recognition words designated by a user; and a program code of a transmission step of transmitting speech data input in the speech input step, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server; a program code of a recognition dictionary holding step of holding, in the server, a plurality of kinds of recognition dictionaries prepared for respective recognition fields; a program code of a determination step of determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and a program code of a recognition step of recognizing the speech data using at least the recognition dictionary determined in the determination step.
 18. A speech recognition server for recognizing speech input at a client, and sending a recognition result to the client, comprising: reception means for receiving, from the client, speech data, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and a user dictionary formed by registering target recognition words designated by a user; recognition dictionary holding means for holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields; determination means for determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and recognition means for recognizing the speech data using at least the recognition dictionary determined by said determination means.
 19. The server according to claim 18, wherein said recognition means recognizes the speech data using the recognition dictionary determined by said determination means, and the user dictionary received from the client.
 20. The server according to claim 18, wherein the speech data is data obtained by encoding that speech data.
 21. A speech recognition client for sending input speech to be recognized to a server, and receiving a recognition result of that speech, comprising: speech input means for inputting speech; user dictionary holding means for holding a user dictionary formed by registering target recognition words designated by a user; and transmission means for transmitting speech data input by said speech input means, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server.
 22. The client according to claim 21, wherein said speech input means comprises display means for displaying an input form as a target speech input, and the dictionary management information is an input form identifier that indicates a type of input form.
 23. The client according to claim 21, wherein the dictionary management information contains information indicating if the user dictionary is used in recognition of the speech data.
 24. The client according to claim 21, wherein the user dictionary is formed by storing pronunciations and notations of the target recognition words in correspondence with each other.
 25. The client according to claim 22, wherein the user dictionary is formed by also storing at least one input form identifier and the target recognition words in correspondence with each other.
 26. The client according to claim 21, wherein the user dictionary is formed by also storing at least one of recognition dictionary identifiers indicating recognition fields of the plurality of kinds of recognition dictionaries, and the target recognition words.
 27. The client according to claim 21, wherein the speech data is data obtained by encoding that speech data.
 28. A method of controlling a speech recognition server for recognizing speech input at a client, and sending a recognition result to the client, comprising: a reception step of receiving, from the client, speech data, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and a user dictionary formed by registering target recognition words designated by a user; a recognition dictionary holding step of holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields; a determination step of determining one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and a recognition step of recognizing the speech data using at least the recognition dictionary determined in the determination step.
 29. The method according to claim 28, wherein the recognition step includes a step of recognizing the speech data using the recognition dictionary determined in the determination step, and the user dictionary received from the client.
 30. The method according to claim 1, wherein the speech data is data obtained by encoding that speech data.
 31. A method of controlling a speech recognition client for sending input speech to be recognized to a server, and receiving a recognition result of that speech, comprising: a speech input step of inputting speech; a user dictionary holding step of holding a user dictionary formed by registering target recognition words designated by a user; and a transmission step of transmitting speech data input in the speech input step, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server.
 32. The method according to claim 31, wherein the speech input step comprises a display step of displaying an input form as a target speech input, and the dictionary management information is an input form identifier that indicates a type of input form.
 33. The method according to claim 31, wherein the dictionary management information contains information indicating if the user dictionary is used in recognition of the speech data.
 34. The method according to claim 31, wherein the user dictionary is formed by storing pronunciations and notations of the target recognition words in correspondence with each other.
 35. The method according to claim 32, wherein the user dictionary is formed by also storing at least one input form identifier and the target recognition words in correspondence with each other.
 36. The method according to claim 31, wherein the user dictionary is formed by also storing at least one of recognition dictionary identifiers indicating recognition fields of the plurality of kinds of recognition dictionaries, and the target recognition words.
 37. The method according to claim 31, wherein the speech data is data obtained by encoding that speech data.
 38. A computer readable memory that stores a program codeof control of a speech recognition server for recognizing speech inputat a client, and sending a recognition result to the client, comprising:a program code of a reception step of receiving, from the client, speechdata, dictionary management information used to determine a recognitionfield of a recognition dictionary used to recognize the speech data, anda user dictionary formed by registering target recognition wordsdesignated by a user; a program code of a recognition dictionary holdingstep of holding a plurality of kinds of recognition dictionariesprepared for respective recognition fields; a program code of adetermination step of determining one or more recognition dictionarycorresponding to the dictionary management information received from theclient from the plurality of kinds of recognition dictionaries; and aprogram code of a recognition step of recognizing the speech data usingat least the recognition dictionary determined in the determinationstep.
 39. A computer readable memory that stores a program code of control of a speech recognition client for sending input speech to be recognized to a server, and receiving a recognition result of that speech, comprising: a program code of a speech input step of inputting speech; a program code of a user dictionary holding step of holding a user dictionary formed by registering target recognition words designated by a user; and a program code of a transmission step of transmitting speech data input in the speech input step, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server.
 40. A client-server speech recognition system for recognizing speech input at a client by a server, the client comprising: a speech input unit inputs speech; a user dictionary holding a user dictionary formed by registering target recognition words designated by a user; and a transmitter transmits speech data input by said speech input means, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server, and the server comprising: a recognition dictionary holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields; a determination unit determines one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and a recognition unit recognizes the speech data using at least the recognition dictionary determined by said determination means.
 41. A speech recognition server for recognizing speech input at a client, and sending a recognition result to the client, comprising: a receiver receives, from the client, speech data, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and a user dictionary formed by registering target recognition words designated by a user; a recognition dictionary holding a plurality of kinds of recognition dictionaries prepared for respective recognition fields; a determination unit determines one or more recognition dictionary corresponding to the dictionary management information received from the client from the plurality of kinds of recognition dictionaries; and a recognition unit recognizes the speech data using at least the recognition dictionary determined by said determination means.
 42. A speech recognition client for sending input speech to be recognized to a server, and receiving a recognition result of that speech, comprising: a speech input unit inputs speech; a user dictionary holding a user dictionary formed by registering target recognition words designated by a user; and a transmitter transmits speech data input by said speech input means, dictionary management information used to determine a recognition field of a recognition dictionary used to recognize the speech data, and the user dictionary to the server.
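
For illustration only, the data flow recited in the client and server claims above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `UserWord`, `RecognitionRequest`, `IDENTIFIER_TABLE`, `RECOGNITION_DICTIONARIES`, `determine_dictionaries`, and `build_vocabulary` are hypothetical, the table contents are invented sample data, and acoustic recognition itself is omitted. Only the determination of recognition dictionaries from the dictionary management information, and the optional merging of the user dictionary, are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class UserWord:
    """One user dictionary entry: a pronunciation stored with its notation."""
    pronunciation: str
    notation: str
    # Input form identifiers the word is tied to; empty means any form (assumption).
    form_ids: tuple = ()


@dataclass
class RecognitionRequest:
    """What the client transmits to the server in this sketch."""
    speech_data: bytes          # encoded speech data
    form_id: str                # dictionary management information (input form identifier)
    use_user_dictionary: bool   # whether the user dictionary is used in recognition
    user_dictionary: list = field(default_factory=list)


# Hypothetical identifier table: input form identifier -> recognition dictionary identifiers.
IDENTIFIER_TABLE = {
    "name_form": ["names"],
    "address_form": ["addresses", "names"],
}

# Hypothetical recognition dictionaries, one per recognition field (pronunciation -> notation).
RECOGNITION_DICTIONARIES = {
    "names": {"suzuki": "Suzuki"},
    "addresses": {"tokyo": "Tokyo"},
}


def determine_dictionaries(form_id):
    """Determination step: look up the dictionary identifiers for a form identifier."""
    return IDENTIFIER_TABLE.get(form_id, [])


def build_vocabulary(request):
    """Merge the determined recognition dictionaries with the applicable user words."""
    vocabulary = {}
    for dict_id in determine_dictionaries(request.form_id):
        vocabulary.update(RECOGNITION_DICTIONARIES[dict_id])
    if request.use_user_dictionary:
        for word in request.user_dictionary:
            # Keep a user word if it is unrestricted or registered for this form.
            if not word.form_ids or request.form_id in word.form_ids:
                vocabulary[word.pronunciation] = word.notation
    return vocabulary
```

In this sketch, a request carrying the form identifier `"address_form"` would be recognized against the hypothetical `addresses` and `names` dictionaries plus any user words registered for that form, while user words registered only for other forms would be left out.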